-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allocate timing counts on-the-fly + support dynamic events #49901
Conversation
src/timing.c
Outdated
for (size_t i = 0; i < n; i++) { | ||
jl_timing_counts_event_t *other_event = (jl_timing_counts_event_t *)jl_timing_counts_events.items[i]; | ||
if (strcmp(event, other_event->name) == 0) | ||
return other_event; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to see this be a hash table lookup or something faster than linear time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this event-creation overhead is going to matter much until we start creating many more events (we have less than 100 now)
Can that be tackled in a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah we can push this to a follow up.
src/timing.c
Outdated
#endif | ||
} | ||
|
||
JL_DLLEXPORT jl_timing_event_t *jl_timing_event_create(const char *subsystem, const char *name, const char *function, const char *file, int line, int color) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like we should deduplicate at this point, rather than relying on each individual event system to handle deduplication. I think at this point we should only have a 1:1 mapping of event name to source location, so there isn't a reason to only deduplicate for counts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already de-duplicate at the source location level, but that's the caller's responsibility
For any de-duplication that goes beyond that, I think splitting to the back-ends makes sense since they all use slightly different information.
For counts specifically, being able to re-use event names in multiple locations seems like a nice feature, and the de-duplication gives us the flexibility to group counts by subsystem in the future if we want to, so I don't really want to rip out the counts event de-dupe either.
e65346e
to
15b9c94
Compare
cadbf29
to
a1c7f04
Compare
a1c7f04
to
7d9ec06
Compare
|
I think the wrong zones are disabled with this change since there is now a mismatch between getting the sorted index when the zone is created vs getting the unsorted index when Something like 903ce26 seems to fix it. |
7d9ec06
to
58320fa
Compare
Looks like a rebasing casualty from yesterday - Thanks for the heads up! Should be fixed now, and event sorting is also back in. |
This paves the way for a Julia-side API that can create new events on-the-fly without having to modify timing.h The core of the change is to introduce two different structs: - An "event" stores all of the statically-determined attributes of a profiler event (typically, zone name and source location info) - A "timing block" stores the dynamic information relevant to a particular span/measurement in the timing run Events and timing blocks have a one-to-many relationship. The intended pattern for a Julia-side API is to construct an event once at parse-time using `jl_timing_event_create` (since this is relatively expensive due to profiler traffic and allocations) and then to create its own timing block on-the-fly for each block entry/exit. This also re-factors the API a bit to hopefully be more consistently named
58320fa
to
3c915f3
Compare
VTune has been tested with a bit of help from @gbaraldi (thanks!) and is looking OK Timing counts and Tracy are also looking good. This change should be good to land once it gets review. |
* fixup for 49996 and add test from 50065 --------- Co-authored-by: Lilith Hafner <[email protected]>
This paves the way for a Julia-side API that can create new events on-the-fly without having to modify timing.h
The core of the change is to split our profiling data into two different structs:
jl_timing_event_t
: stores statically-determined attributes of a profiler event (typically, zone name and source location info)jl_timing_block_t
: stores the dynamic information relevant to a particular span/measurement in the timing runEvents and timing blocks have a one-to-many relationship, and both can be allocated dynamically. The intended pattern for a Julia-side API is to construct an event once at parse-time using "jl_timing_event_create" (since this is relatively expensive due to profiler traffic and allocations) and then to create its own timing block on-the-fly for each block entry/exit.
P.S. @pchintalapudi I'm planning to re-base this on top of #49679 as soon as that landsDone